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Abstract. The properties of the first (largest) eigenvalue and its eigenvector (first 
eigenvector) are investigated for large sparse random symmetric matrices that are 
characterized by bimodal degree distributions. In principle, one should be able 
to accurately calculate them by solving a functional equation concerning auxiliary 
fields which come out in an analysis based on replica/cavity methods. However, the 
difficulty in analytically solving this equation makes an accurate calculation infeasible 
in practice. To overcome this problem, we develop approximation schemes on the basis 
of two exceptionally solvable examples. The schemes are reasonably consistent with 
numerical experiments when the statistical bias of positive matrix entries is sufficiently 
large, and they qualitatively explain why considerably large finite size effects of the 
first eigenvalue can be observed when the bias is relatively small. 
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1. Introduction 

Since their introduction by Wigner for approximating the complex Hamiltonian of heavy 
nuclei, random matrices have been used in various fields of physics and other disciplines. 
The list of applications includes nuclear theory [I], quantum chaos [2], localization 
in electron systems [3], finance [I], complex networks [6], wireless communication [5], 
combinatorial problems in computer science [7J, and more. 

In general, the purpose of random matrix theory (RMT) is to investigate the 
statistical properties of physical quantities that are defined by samples drawn from an 
ensemble of N x N random matrices. A major topic is the evaluation of the asymptotic 
eigenvalue spectrum, which is the typical distribution of eigenvalues as N — > oo. 
For a Gaussian orthogonal ensemble (GOE), whose matrix entries are independently 
distributed obeying identical Gaussian distributions of zero mean, the spectrum follows 
the Wigner semicircle distribution [8]. Another type of asymptotic spectrum, termed the 
Marcenko-Pastur distribution, comes from the covariance matrix of rectangular matrices 
whose entries are independently sampled from identical Gaussian distributions of zero 
mean [9]. Recent developments on sparsely connected disordered systems have led to 
significant progress in being able to analyze the spectrum of sparse random matrices 

Evaluation of the first (largest) eigenvalue and its eigenvector (first eigenvector) 
is another major topic of RMT. For a GOE, the first eigenvalue converges to 2 when 
the variance of the matrix entries is provided as iV -1 in the limit of N — > oo, and 
the finite size correction follows the Tracy- Widom distribution when N is large but 
finite [22]. As for the covariance matrix of dense random rectangular matrices, the 
asymptotic behaviors of the first eigenvalue/eigenvector have been examined analytically 
and numerically in situations where one can set the strength of preferential directions 
underlying the random rectangular matrices [231 EI]- For sparse matrices, convergence to 
the Tracy- Widom distribution was recently proved for the first eigenvalue in the case of 
fixed degrees, which denote the numbers of nonzero entries per row/column in matrices, 
and entries of random signs [25]. There are also various studies on the second eigenvalue 
of adjacency matrices of fixed degrees [261 121] • However, as far as the authors know, 
the first eigenvalue problem for ensembles of sparse matrices has not been sufficiently 
examined yet, despite there being analyses of their spectrum. Moreover, the need for 
an accurate solution to the first eigenvalue problem seems to be growing, because the 
first eigenvector is useful for extracting valuable information in the field of data analysis 
[281 [29] and in constructing approximate solutions of various combinatorial problems 
[3 [30]. 

In light of this potential need, we herein investigate the asymptotic properties of the 
first eigenvalue/eigenvector for ensembles of sparse symmetric matrices. A preliminary 
investigation indicated that the properties are considerably influenced by the fluctuation 
of degrees [31] . In order to be able to control the influence of the degree fluctuation and 
the properties of the first eigenvector in a simple manner, we focus on matrix ensembles 
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that are characterized by a bimodal degree distribution and a biased binary distribution 
of nonzero matrix entries. 

This article is organized as follows. In the next section, we explain the model 
that we will examine. Section [3] introduces the methodological bases for analyzing the 
model. Although the model we investigate seems quite simple, analyzing it exactly in 
general situations is technically difficult. Nevertheless, one can still analytically solve 
the problem in two specific cases, which are shown in section HI On the basis of lessons 
derived from the solvable cases, we develop approximate assessment schemes for handling 
a more general situation in section |5j The final section is devoted to a summary. 

Some contents in the following are shared with a conference paper [31]. Precisely, 
the model to examine is identical, and the methodology in section 3.2 and the solvable 
example in section 4.1 were shown for the first time in the conference paper. The other 
parts are, however, newly provided in the present article. 

2. Model definition 

We consider a sparse network of N nodes indexed by i = 1,2, . . . , N. The network is 
characterized by a bimodal distribution p{k) = pib~k, Cl + P20~k,c 2 of degree k{= 0, 1, 2, . . .), 
which stands for the number of links from each node to other nodes. Here, we assume 
that p±,p 2 £ [0, 1] satisfy p± + p 2 = 1 and C\ < c 2 . Moreover, 5 XiV = 1 if x = y, and it 
vanishes, otherwise. We denote the average degree as c = p\C\ +P2C2, and suppose that 
the network is constructed randomly for aspects other than degree. A practical scheme 
for generating such a network is basically as follows [32J: 

(S) Set ki = c\ and ki = c 2 for Npi and Np 2 indices of % — 1, 2, . . . , N, respectively, 
and make a set of indices U to which each index % attends ki times. Accordingly, 
steps (A)-(C) are iterated as follows. 

(A) Choose a pair of two different elements from U randomly. 

(B) Denote the values of the two elements as % and j. If i ^ j and the pair of % 
and j have not been chosen up to that moment, make a link to the pair, and 
remove the two elements from U. Otherwise, return them back to U. 

(C) If U becomes empty, finish the iteration. Otherwise, if there is no possibility 
that any more links can be made by (A) and (B), return to (S). 

Once we have generated the network, we assign entries Jy = ±1 to the links 
generated by the above procedure, sampling ±1 randomly and independently from a 
biased binary distribution: 

Pj(Jij\ A ) = ^-i . 1 + ^2^,-1- (1) 

We set Jij — if i and j are not connected in the network. This yields a sample 
J = (Jij) of sparse random symmetric matrices that we will focus on. 
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The objective of our study is to investigate how the properties of the first eigenvalue 
A/eigenvector V = (Vi, V2, ■ ■ ■ , Viv) of the random matrix J depend on the system 
parameters c\,C2,pi = 1 — P2 and A as N tends to infinity. A simple consideration 
guarantees that A is upper bounded by C2 for any realization of J ( Appendix A ). On 
the other hand, when pi — 1 and A = 1, which means each row/column of J has 
Ci entries of unity exactly, A = c\ and V oc (1,1,..., 1) T hold, where T denotes the 
matrix transpose operation. This implies that the inverse participation ratio (IPR) of V, 
IPR = (X)i=i Vf) / Cl2iLi Vi 2 )' ''■> converges to zero, and therefore V extends over almost 
all nodes as N tends to infinity in the vicinity of this parameter setting. However, 
earlier studies have indicated that V can be localized in the vicinity of a few nodes 
being characterized by a finite IPR when a small number of nodes of larger degree C2 
are added to the sparse network and if C2 is sufficiently large [TT], [12l [TTJ . One of our 
interests is to clarify how such a change in the profile of V is related to the value of A. 



3. Analytical bases: replica and cavity methods 

3.1. Replica method 

Formulating the first eigenvalue problem as 

A=^nmx{v T Jv} subj. to \v\ 2 = N, (2) 

will form the basis of our analysis. Here, maxx{/(X)} denotes maximization of a 
function f(X) with respect to X. The solution to this problem accords with V. 
Identifying — (l/2)v T Jv as the Hamiltonian of the dynamical variable v yields the 
partition function, 

J) = J dv5 (\vf- N) exp (^^) - (3) 

This offers another way to express the first eigenvalue: A = 2 lim^_ >00 (A r /3)~ 1 In Z((3; J). 
The typical first eigenvalue can be obtained by averaging the logarithm of the partition 
function over the random matrix J. 

The above considerations naturally lead one to consider trying a solution using the 
replica method |33j. Hereafter, let us generally denote [0(AT)]x as the average of 0(X) 
with respect to random variable X. In the replica method, we first evaluate analytical 
expressions of the moment of Z((3;J), [Z n ((3; J)]j, for Vn = 1,2,... G N utilizing 
an identity Z n (/3; J) = J (Ua=i dv a 5(\v a \ 2 - N)) x exp ((/3/2) Ea=iK) Tj ^ a ) ; which 
is valid for only n G N. The integration variables v a (a = 1,2, ... , n) are sometimes 
termed "replicas" since they can be regarded as n copies of the original variable v 
that share the identical external random coupling J. Although the identity is valid for 
only nGN, the expressions of [Z n ((3; J)]j evaluated with the saddle point method for 
N ^> 1 under appropriate assumptions about the permutation symmetry of the replica 
indices a = 1,2, ... ,n are likely to hold for n G M as well. Therefore, we can employ 
the analytical expressions for computing the average of the logarithm of the partition 
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function by utilizing the identity N' 1 [In Z(/3; J)}j = lim n _ >0 (9/9n)iV- 1 In [Z n (/3; J)]j. 
In particular, under the replica symmetric (RS) ansatz, which implies that the saddle 
point is invariant under any permutation of the replica indices, this yields an expression 
for the typical first eigenvalue as 
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•]j denotes the average with respect to ([T]). Hereafter, we shall not distinguish 
between A and [A]j because A — > [A]j should hold with a probability of unity for 
N — y oo because of the self- averaging property. The variational functions q(A, H) and 
q(A, H) are joint distributions that come from the RS saddle point calculation, whereas 
A originates from the constraint of the 5-function in ([3]). The notation extix{f{X)} 
generally stands for extremization of f{X) with respect to X. A derivation of (J4j) — (EJ) 



is shown in Appendix B 



3.2. Cavity method 

An alternative approach, termed the cavity method [M], is of utility for understanding 
the physical implications of the seemingly artificial extremization variables q(A,H), 
q{A, H) and A. In the spirit of mean field theory, directly approximating the multivariate 
optimization problem of (T5]) by a bunch of single-variable problems as 

max { -Aiv] + IH^i } (8) 

Vi 

(i = 1, 2, . . . , N) is another promising scheme for computing A, wherein the coefficients 
Ai and if, are to be determined in a self-consistent manner. In the cavity method, this 
is done by determining the cavity fields A^j and H^j, which denote the coefficients 
of (JE|) for the j-cavity system, where a node j of the neighbor of a focused node i is 
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removed, by using the belief propagation algorithm [33 EH ES]: 

Ai-tj = A — 2^ Ak-ti, Hi^rj = ^ H k ^i- (10) 

k£di\j k£di\j 

Here, A is a Lagrange multiplier for introducing the constraint \v\ 2 = N of fl2]) while 
new auxiliary variables A^j and are sometimes termed the cavity biases, di 

denotes the neighbor of % and di\j stands for a set defined by removing node j from di. 
After determining the cavity fields/biases, the coefficients of the approximate objective 
functions are found to be 

Ai = \-J2 4fc-i, H i = J2 (11) 

kedi k£di 

In a random sparse network, the typical lengths of cycles in the network 
grow as O(lnTV), which means that the system can be locally regarded as a tree, 
ignoring any feedback effects. This allows us to characterize the macroscopic 
properties of the objective system by utilizing the distributions of the cavity 
fields/biases q(A, H) = \di\)~ l £ii 5(A-A j ^ i )5(H—Hj^ i ) and q(A, H) = 

(J2f=i l^l) -1 J2f=i Ylij£di&(A — Ai^j)5(H — Hi-+j), where |<S| stands for the number of 
elements in the set S. Equations ([9]) and (fTUl) indicate that q(A, H) and q(A, H) are 
determined in a self-consistent manner: 



q\A, H) = J dAdHq(A, H) 
q(A 



A) \ A /x 
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r 2 I [ \ dA^dH^Afj,, H^)5 \ A — X+ } A^ \ 5 \ H— y H^] , (13) 

where r\ = c\pi/c represents the probability that one terminal node has degree c\ when 
a link is chosen randomly from the connectivity network and similarly for r 2 = C2P2/C. 
On the other hand, fill I) means that the joint distribution of A t and Hi of (JSJ) is 

/Cl / Cl \ / Cl 

HdA^dH^q(A^Hj5 {A-X + J^aA 5 Ih-J^H, 

/C2 / C2 \ / C2 \ 

which leads to the extremization condition with respect to the Lagrange multiplier: 

1 = J dAdHQ(A, H) . (15) 

It is noteworthy that (fT2"|) . f Tl~3]) and (TT5T) exactly constitute the extremization 
condition of (Jlj). This allows us to interpret q(A, H), q(A, H) and A in 01]) as distributions 
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of the cavity fields/biases and the Lagrange multiplier, respectively. This interpretation 
indicates that the supports of q(A,H), q(A,H) and Q(A,H) cannot be extended to 
the region of neither A < nor A < in order to make the approximate single body 
maximization problems flS} well-posed. This condition plays a crucial role in the later 
analysis. 



4. Two solvable examples 

Now we are ready to tackle the first eigenvalue problem. However, solving the problem 
exactly is still difficult since it involves functional equations of ( fl2|) and (fl3|) . Therefore, 
we shall first analyze two solvable examples in order to get insights into constructing 
appropriate approximation schemes. 



4-1. Single- degree model 

The first example is the case in which pi = 1 exactly holds, which means that all 
nodes possess the same degree c\. We will refer to this example as the single-degree 
model. Equations (TT21) and (1131) imply that marginal distributions q(A) = J dHq(A, H) 
and q\A) = J dHq(A, H) generally constitute a set of closed equations while q(H) = 
J dAq(A, H) and q{H) = J dAq[A, H) do not. In particular, in the case of p\ = 1, for 
which n = 1 and r2 = hold, this allows us to assume that the distributions are of the 
forms q{A) = 5 (A — a) and q(A) = 8{A — a). Inserting these into (j4j) yields 

f / am 2 + Am?\ /m2 + 2m 1 m 1 + m 2 

A = extr <^ d - ci 

[\a z — 1 / \ a — a 

+ Cl (m 2 -mj) + ^ + l 



A — C\a 

where mi and m 2 are the first and second moments (about the origin) with respect to 
q(H \A) = q{H), and similarly for fhi and m 2 . The extremization is carried out with 
respect to all variables except for A. After some algebra, the extremization conditions 
of (TiT)|) can be summarized as 



1 



mi 



A — (ci — l)a' 
A(ci — l)mi 



;i7) 



A — (ci — l)a 
ci - l)(ci - 2)m{ ( ci 



(A-( Cl -l)a) 2 V (A-(d-l)a)^ 



m 2 , (19) 



Ci(m2 — mf) + c\fh\ 



1. (20) 



(A — c{a) 2 

Equation (Tl8|) indicates that the solutions can be classified into two types depending on 
whether rhi vanishes or not: 
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Figure 1. (Color online) Theoretical predictions and experimental results for the 



3. (a) The first eigenvalue, (b) M = N~ 



Eli* 



of 



single-degree model of c\ 
the first eigenvector V = (Vi). Symbols represent averages over fOOO experiments 
for N = 250, 500, and 1000 systems from the bottom and the top in (a) and (b), 
respectively. The solid curves are the theoretical predictions ([2"3")l and (|25[) for (a) and 
(b), respectively, (c) Symbols denote IPR of V for A = and 1 from the top. A slope 
of OIN^ 1 ) is shown as a broken line for reference. 



mi 7^ 0: Equation ([15]) means that A(ci — 1)/(A — (ci — l)a) = 1 holds for fhi ^ 0. 
This, in conjunction with (117j) . gives 

A=(d-l)A+i (21) 

and a = l/(A(ci — 1)). Inserting these values into (TI9]) and (|20|) yields nonzero 
values of mi and m2, and the positivity of m 2 makes this solution valid only for 
A > A c = 1/Vci - !■ 

mi = 0: Equation (fT9|) means that (ci — 1)/(A — (ci — l)a) 2 = 1 holds for fhi = 0. 
This, in conjunction with (I17j) . gives 



A = 2VcT r l, (22) 



and a = \j\fc\ — 1. Inserting these and mi = into ( 120|) yields the value of m2. 

In both cases, A = A after extremization. Therefore, the first eigenvalue of the single- 
degree model can be written as 

A = f ( Cl -1)A + l/A, A>A C . 

Inserting the functional forms of q(A) = 5 (A — a) and q(A) = 5 (A — 
a) into (12) and (13) yields a self-consistent equation for q(H) as q(H) = 
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/n^Li dH^qiJIfj) 5{H — YTu=i J^H^/a) . This indicates that the moment 

J {Jn} 

generating function of q(H), gnit) = f dHq(H)e tH , satisfies a relation = 
(^-^gH(t/a) + ^-^gH(—t/a)) Cl . The first and second moments, mi and m 2 , of g(-f^) 
are determined by solving the extremization problem of (16). For higher moments of 
degree n > 3, taking n-th derivative of guif) at t = offers a formula that evaluates 
n-th moment, m n , of q{H) from the moments of lower degrees m^mj, . . . ,m n _i. In 
particular, the formulae for and 7774 are provided as = (a 3 — (ci — l)) -1 (ci — 
l)(c! — 2) (37712777,! + (ci — 3)m^) and 
(d-lXd-2) 
a 4 - (ci - 1) 

x (3?772 + 4m 3 ?77i + 6(ci - 3)m 2 ml + (ci - 3)(ci - 4)m*) , (24) 

respectively. These guarantee that moments of q(H) are finite at least up to the fourth 
degree. As n-th moment of the distribution of entries of the first eigenvector P(V) = 
N- 1 Ef=i W - Vi)]j = J UU dH,q(H,) \S(V - E? =1 W(a(\ - Cl /a)))\ 

L - 1 {J ti} 

can be evaluated from 777,1,7772, ■ ■ ■ , 777 n , this indicates that the fourth moment of P(V) 
is finite, and therefore IPR of the single-degree model vanishes as 0(A r_1 ) as iV — > 00. 
The above computation also implies that the first moment of P(V) is given as 

M= { ci(ci - ^A^/CCd - 1) 2 A 2 - 1), A > A c , 

\ 0, A < A c . 1 J 

This indicates that V for A > A c is macroscopically polarized in the direction 
of (1,1,..., 1) T , although the objective function v T Jv is invariant under the 
transformation of v — > —v. This is analogous to the spontaneous symmetry breaking 
observed in models of ferromagnetism, and therefore we will term the solutions of 
this type ferromagnetic solutions. On the other hand, (122]) corresponds to the critical 
condition that equation ffTTj) . which is a function of a, has complex solutions for a given 
A. The complex solutions of a are generally associated with the eigenvalue distribution 
of J [16], and (jTTI) actually matches the larger band edge of the asymptotic eigenvalue 
spectrum of the single-degree model. Therefore, solutions of this type will be referred 
to as band edge solutions. 



4-2. Defect model 

Another solvable model can be created by adding only one node of a larger degree 
C2 > Ci to the single-degree model. We refer to the larger degree node as the center, 
indexed as i = 0. Let us pay attention to the tree structure rooted at the center 0. 
In the following, we approximately handle the network as an infinite tree rooted at 
since feedback effects are expected to be negligible in large random sparse networks as 
mentioned in section 3.2. Equations © and (fTUj) indicate that for a given A, all of the 
^4-cavity biases heading for the center are given as the smaller solution of (11711 . Using 
this, the second order coefficient of the center node is provided as 

A = A - c 2 a. (26) 
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This means that A > is a required condition for determining the first eigenvalue, 
yielding 

A = A= (27) 

V C 2 - Ci + 1 



The expression of (J27j) was also obtained recently in a mathematically rigorous manner 
for A = 1 [37J. We will term solutions of this type defect solutions because of their 
physical implications shown by the following naive analysis [TTj . 

Since the tree is free of cycles, one can always convert the eigenvalue problem into 
one for unit nonzero entries of = 1 by making a gauge transformation. Let us denote 
the distance between node i and as d. Due to the spatial symmetry, the entries of 
the first eigenvector V{ only depend on d. Therefore, we will rewrite their values as Vd, 
which allows us to express the eigenvalue equation as 

AVo = c 2 V h 

AV d = V d _i + (ci - l)V d+1 (d=l,2,...) (28) 

after the gauge transformation. 

The Perron- Frobenius theorem indicates that Vd is of an identical sign for \/d > 0. 
In addition, the normalization constraint of \V\ 2 = N requires the boundary condition 
lim^oo (ci — l) d V d 2 < oo. This choice of solution reproduces the expression of the first 
eigenvalue (]2"T|) and leads to 

V d = (c 2 - Cl + iy d / 2 V . (29) 

This indicates that the first eigenvector is localized in the vicinity of the center yielding 
a finite IPR, 

IPR = (C2 = 2Cl + 2)2(C2 = Cl + 2) (30) 
4(c 2 - Cl + 1) ((c 2 - Cl + If - Cl + 1) ' [6V) 

while the first eigenvector for the single-degree model extends over all nodes making 
IPR vanish as iV — > oo for both the ferromagnetic and band edge solutions. 

The analysis shown above means that the existence of a few nodes of larger degree 
can change the first eigenvalue/eigenvector significantly, as also pointed out in earlier 
literature [ITJ [12]. Nodes of sufficiently larger degree act as defects receiving more 
cavity biases than nodes of their surroundings, which boosts the first eigenvalue due 
to the positivity condition of ( 12 6 j) creating a localized eigenvector. In the current case, 
this occurs for sufficiently small A if c 2 > 2(ci — 1) and for VA < 1 if c 2 > ci(c\ — 1). 
Figure [2] compares the theoretical predictions and the results of numerical experiments 
for (a) the first eigenvalue and (b) IPR of the first eigenvector in the case of c\ = 3 
while varying c 2 from 4 to 7. The numerical experiments were carried out for randomly 
generated networks of finite sizes while the theory is based on the tree approximation. 
In spite of this difference, the theoretical curves of these plots are good matches for the 
numerical ones. 

In practice, our analysis implies that sufficient precautions must be taken when one 
utilizes the first eigenvector as a heuristic solution for extracting certain information 
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Figure 2. (Color online) Theoretical predictions and experimental results for the 
single-defect models of c\ = 3 and c-i = 4 ~ 7. (a) The first eigenvalue, (b) IPR of the 
first eigenvector. The squares, triangles, asterisks, and circles of both figures represent 
averages over 100 experiments for N = 8000 (odd C2) or 8001 (even C2) systems of 
C2 = 4,5,6, and 7, respectively. The lines are theoretical predictions. 



from a sparse matrix J. Even if information on a certain preferential direction is 
embedded in J as the first eigenvector, the information can be easily hidden by adding 
only one node of a sufficiently larger degree. To avoid such possibilities, the earlier 
literature [30] suggested a preprocessing removing nodes of extraordinarily large degree. 

Equation ( 1261) indicates that the eigenvalue of the defect solution becomes larger 
as the cavity biases coming to the center increase. In addition, the right hand side 
of (1T71) shows that the cavity biases heading for the center increas as the degrees of 
surrounding nodes grow. This means that if the number of nodes of the large degree is 
fixed, the first eigenvalue will be maximized when they are aggregated in the vicinity of 
the center. As a simple model for representing such situations, let us consider cases in 
which all nodes within a certain radius g(= 0, 1,2,.. .) from the center have the larger 
degree c 2 , while the degrees of the other nodes have c\. The case of g = corresponds 
to the single-defect model. The analysis above indicates that the first eigenvalue of this 
aggregated defect model can be estimated by solving the recursive equation, 

Vi = (X/c 2 )V , 

f {X V d - V d ^)/(c 2 - 1) (d = 1, . . . ,g) 
d+1 y — Vd-i)/ (ci - 1) (d = g + l,...) 1 ) 

under the condition that V d is of an identical sign for Vd > and lim rf ^ 00 (ci — l) d V d 2 < 00. 
Given A, the solution that satisfies this condition is generally represented as Vd = 
const x (a„,(A)) d for d > g+ 1, where a* (A) is the smaller solution of ( ITT)) . The condition 
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Figure 3. (Color online) ln(Vd) versus d for the recursive equation (f3"Tj) for the case 
of ci = 3 and C2 = 7. Circles represent the solution obtained from the expression for 
1 < d < g of (pffl) . Lines stand for the slope of ln(a»(A)). (a): A is chosen so that 
ln(V 9 +i) — ln(Vj) = ln(a*(A)) holds for g = 2. Asterisks represent the correct solution 
of ([31]) for d > (g + 1) + 1 = 4. The requirement offers A = A = 4.1350. (b) The case 
of A = 2V^2 _ T = 4.8990. For A > 2y/^~l, ln(V d+1 ) - ln(V d ) > ln(a*(A)) holds for 
Vd > 0. Therefore, there is no g that satisfies (1321) . 



under which (|31~|) possesses a solution of this type is expressed as 

a* (A), (32) 



V 9 ~v g+1 



which can be used to get the first eigenvalue A of the aggregated defect model. The 
IPR of the first eigenvector also comes from ( 13~T1) . 

Figure [3] (a) illustrates how to arrived at ( 132]) . This figure characterizes the first 
eigenvalue A by the condition that the difference hifV^+i) — hi(V^) accords to the 
target value ln(a„,(A)). For A G (2-^/c! — 1, 2a/c2 — 1), the left and right terminals of 
which correspond to the band edge solutions of single-degree models of degree c\ and 
C2, respectively, the difference ln(Vrf + i) — ln(Vrf) of the solution of the expression for 
1 < d < g of ( ]3T]) (circles) can generally vary from a larger value to smaller values 
than ln(a*(A)) as d increases from 0. This is because the roots of the characteristic 
equation of the recursive equation are complex numbers, and therefore the solution 
governed by this recursive equation vanishes in the manner of a damped oscillation 
as d grows. This means that, for a given g > 0, there always exists a certain value 
of A G {2y/c\ — 1, 2a/c2 — 1) that satisfies (|32l . On the other hand, in the region of 
A > 2\/c2 — 1, the characteristic equation yields roots of positive numbers that are larger 
than ln(a*(A)), which means that ln^+i) — ln(V^) is always larger than ln(a*(A)). This 
makes it impossible for ( |32l to hold (figure [3] (b)). Consequently, the first eigenvalue of 
the aggregated defect models increases from the value of (1271) to 2^/c2 — 1 as g grows 
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Figure 4. (Color online) Log- log plot of 5 A = 2^/c2 — 1 — A versus g for the solutions 
of ADA in the case of c\ — 3 and C2 = 7 (symbols). Slopes of g _1 and g" 2 (lines) are 
plotted for reference. 



from to oo, while the IPR of the first eigenvector decreases from the value of ( 130]) to 
zero. 

The convergence behavior of A is roughly evaluated as follows. For A = 2^/c 2 — 1— e, 
where < e < 1, the imaginary part of the roots of the characteristic equation of the 
expression for 1 < d < g of ( 131"|) scales as 0(e 1 ^ 2 ). The radius g that satisfies (132]) for 
given A is supposed to be in the same range as the period of the damped oscillation 
caused by the complex roots. This leads to g ~ 0(e~ l l 2 ), and yields an asymptotic 
relation A ~ 2y/c 2 - 1 - 0(g~ 2 ) for # > 1 (figure S]). 

Figure [5] shows the first eigenvalue (a) and IPR of the first eigenvector (b) for c\ = 3 
and C2 = 7 in the case of A = 0. Experimental results for N = 1000 ~ 32000 exhibit 
excellent agreement with the theoretical prediction. 

5. Approximation for the general case 

Let us consider a more general situation in which both pi and p 2 are 0(1). The 
framework developed in section [3] would in principle be valid even in such cases; one 
would be able to accurately evaluate the typical first eigenvalue by utilizing the solution 
of ( 1T21) and ( |T3l) . Unfortunately, this is difficult in practice. First of all, analytically 
finding the solution is a hopeless task. Even numerical methods using the standard 
discretization approach, with the current level of computational resources, have trouble 
in achieving enough accuracy because of quantization errors. Statistical fluctuations 
also prevent a sampling approach using population dynamics despite that it performs 
pretty well in evaluating the bulk profile of the asymptotic eigenvalue spectrum [T6l fTTj . 
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Figure 5. (Color online) Theoretical predictions and experimental results for the 
aggregated defect models of c\ = 3 and C2 = 7. (a) The first eigenvalue, (b) IPR of the 
first eigenvector. Symbols denote averages over 100 experiments for N = 1000 ~ 32000. 
Lines represent theoretical predictions. 



We will avoid such difficulties by taking an alternative strategy. Specifically, we 
will develop an approximate evaluation scheme that can be handled without solving the 
functional equations. This scheme does not suffer from either quantization errors or 
statistical fluctuations, although its estimate may be structurally biased. 



5.1. Effective medium approximation 

The first approximation involves restricting the variational functions in fll]) to those of 
the forms q(A, H) = 5(A — a)q(H) and q(A, H) = 5(A — a)q(H) as assumed in the single- 
degree model. We call this the effective medium approximation (EMA) since a similar 
scheme is referred to by this name in a study of evaluating the asymptotic eigenvalue 
spectrum [IT]. This approximation yields 

f / am 2 + Am?\ _ / m.2 + 1m\rh\ + m-i 
A = extr < c 



a? — 1 / \ a — a 



ci(m 2 - mf) + c\m\ c 2 (m 2 - rh\) + c^rh\ 

+Pl 7 ~ \~P2 7 ~ \-Af. [66) 

A — C\a X — c 2 a 



The implications of the variables are similar to those of (iTol) . and the extremization is 
carried out with respect to all variables except A. The extremization condition of (1331) 
yields the following self-consistent equations: 

1 



(34) 

,Vi(ci - 1) r 2 (c 2 - 1)\ _ 

mi = A(l - a ) — — + — — mi, (35) 

A — cid A — c 2 a I 
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2a f n(ci + r 2 (c 2 - 1) \ _ / ri(ci - 1) + r 2 (c 2 - 1) 



d 2 \ A — Cxa A — c 2 d / \ (A — c{a) 2 (A — c 2 a) 2 

m 2 , (36) 



r x ci r 2 c 2 d 2 



2^2 



where 



(A — cid) 2 (A — c 2 d) 2 (1 — a 
cim 2 + ci(ci - l)mf c 2 m 2 + c 2 (c 2 - l)m 2 

r\ ^ + ^ 2 n ^2 = 1; 37 

(A — Cia) 2 (A — c 2 a) 2 



n + (38) 



A — c%a A — c 2 a 

Similarly to the case of the single-degree model, equation (|35|) indicates that the 
solutions can be classified into two types depending on whether rh\ vanishes or not: 

• rh\ 7^ 0: Equation ( 1351) means that 

A(l - d 2 ) ( r \ (Cl + r2 x (C2 ~i } > | = 1. (39) 
\ A — c\a A — c 2 a / 

This and (13^1) together determine A and d. Inserting the determined A and a into 
fl3"6"|) and (13T|) yields frii and m 2 . We will refer to this estimate as the ferromagnetic 
approximation (FA). 

• fh\ — 0: Equation fl36|) means that 



(A-cid) 2 (A-c 2 d) 2 (l-d 2 ) 2 ' 

This and (13^1) determine A and a. Equation (1401) coincides with the critical condition 



of A that (JMj) possesses a solution with a complex d (see |Appendix CD , which gives 
the larger band edge of the asymptotic eigenvalue spectrum under EMA. Therefore, 
we will call this estimate the band edge approximation (BEA). Inserting the values 
of A and a into (l37j) yields m 2 . 



5.2. Aggregated defect approximation 

In addition to the above, the analysis of the defect models offers another criterion for 
the first eigenvalue. According to the cavity interpretation, a is an exemplary value of 
the cavity biases A^j. Therefore, the requirement that (126]) must not be negative for 
any node of the network leads to the condition, 

A = c 2 d, (41) 

which corresponds to the single-defect approximation (SDA) in the estimate of the 
eigenvalue spectrum [TT| [T2~]. However, this, being combined with ( )34l) and ( !38|) . always 
yields a solution of a = 1 and A = c 2 , which corresponds to the trivial upper bound of 
A for the current bimodal degree model. 

For improving on this result, we can generalize the SDA to higher level of 
approximation by replacing (14"T|) with ( 132]) and identifying the solution of (134|) as a* (A). 
We shall refer to the estimate based on this idea as the aggregated defect approximation 
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(ADA). A similar idea was mentioned in an earlier study on the eigenvalue spectrum 

Similarly to the argument presented in section 14.21 the estimate of the first 
eigenvalue becomes larger as g grows from 1 to infinity. In particular, the ADA estimate 
converges to that of the band edge solution of the single-degree model of c 2 , 2-y/c 2 — 1, 
as g oo. Aggregations of the larger degree nodes of arbitrary sizes appear with a 
probability of unity as A" tends to infinity as long as both pi and p 2 are 0(1). This 
indicates that 2a/c 2 — 1 is the appropriate estimate of ADA for the current model of 
A" — > oo irrespective of the details of the degree distribution. 

However, this does not mean that the estimate is practically relevant for explaining 
the results of experiments on computationally feasible system sizes. The number of 
nodes of the larger degree c 2 surrounding the center of an aggregated defect of radius g 
is n g = c 2 + c 2 (c 2 - 1) + c 2 (c 2 - l) 2 + . . . + c 2 (c 2 - l)*- 1 = c 2 ((c 2 - 1)* - l)/(c 2 - 2). 
Using this formula, the probability of a node being the center of the aggregated defect is 
P g ~ p 2 xr 2 9 , when bothpi andp 2 are 0(1). The typical size of the largest aggregation in 
a network of A" nodes can be roughly found using the condition AfP ffmax ~ 1. This yields 
n Smax ~ 0(ha.N) and therefore the maximum radius <? max typically scales as OQnlnAQ. 
This, in conjunction with the argument of section |4~2| indicates that the first eigenvalue 
behaves as A ~ 2y/c 2 — 1 — 0((ln In N)~ 2 ). The In ln-dependence on A" implies that A 
can be arbitrarily close to 2-y/c 2 — 1 as A" — > oo, but a very large A" is necessary for 
experimentally observing the convergent behavior. 

5.3. Comparison with experimental results 

The largest value among the estimates of FA, BEA, and ADA is an approximate estimate 
of A. To examine the utility of our approximation scheme, we compared the estimated 
values of A with the results of numerical experiments for the cases of c\ = 3, c 2 = 7, 
and pi = 1 — P2 = 0.9 by varying A" from 1000 to 32000. The results are depicted in 
Fig. [6] (a). Symbols represent the averages of the first eigenvalues for 100 realizations 
of matrices. 

As for the choice of parameters, BEA offers an estimate Abea = 3.9146. As shown 
in figure [6] (a), the estimate of FA, Afa, generally bifurcates from that of BEA at a 
critical value A c , which is evaluated as 0.6762 for the current parameter choice, as A 
grows larger from below. The results of the experiments show fairly good accordance 
with the estimate of FA as A approaches 1 in the region of A > A c . On the other hand, 
those for A < A c grow gradually as A" increases. This is probably because the typical 
size of the maximum aggregation of the larger degree nodes that dominates the first 
eigenvalue in the network increases very slowly, as estimated above. ADA estimates 
Aada to be 3.9676, 4.2119, and 4.8990 for g = 1, 2, and oo, respectively. The condition 
of NP gm ^ ~ 1 gives c/ max ~ 0.6281 ~ 0.8575 for N = 1000 ~ 32000. This implies that 
an ADA of g = 1 is closest to those of the experiments. Actually, it exhibits reasonable 
consistency with data on larger system sizes A" = 8000, 16000, and 32000, even though 
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Figure 6. (Color online) Theoretical predictions and experimental results in the cases 
of c\ — 3, C2 = 7, and pi = 1 — p2 = 0.9. (a): The first eigenvalue. Symbols represent 
averages over 100 experiments for N = 1000, 2000, 4000, 8000, 16000, and 32000 
systems from the bottom. Lines represent the theoretical predictions by BEA and 
AD As of g = 1 and 2 from the bottom while the curve stands for that by FA. The 
results of ADA indicate that the first eigenvalue converges to 2y/c2 — 1 = 4.8990 as 
N tends to infinity. However, due to the In ln-dcpcndence of g max on N, a very large 
N would be necessary for experimentally confirming the convergence, (b): IPR of the 
first eigenvector. Symbols represent averages over 100 experiments for N = 1000, 2000, 
4000, 8000, 16000, and 32000 systems from the top. Lines represent the theoretical 
predictions of ADAs of g = 1 and 2 from the top. 



the current estimate of g max is based on a rough argument. 

Figure [6] (b) plots the average of IPR for the first eigenvector. The results of the 
experiments (symbols) are considerably smaller than the theoretical predictions of ADA 
of g = 1, 2 (lines). When a network is randomly generated, multiple aggregations of the 
larger degree nodes appear simultaneously, which reduces the value of IPR. This may 
be the reason for the significant discrepancy between the theoretical and experimental 
results. 

6. Summary 

We investigated the properties of the first (maximum) eigenvalue and its eigenvector 
(first eigenvector) by using methods of statistical mechanics for sparse symmetric 
random matrices characterized by a bimodal degree distribution. Employing the replica 
method, we provided a general formula for evaluating the typical first eigenvalue in 
the large system size limit. Unfortunately, the replica-based scheme involves functional 
equations, which are difficult to solve accurately. Therefore, we developed approximate 
evaluation schemes based on the results for two solvable cases and techniques previously 
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proposed for estimating the eigenvalue spectrum. Our schemes are reasonably consistent 
with results of experiments when the statistical bias of the positive matrix entries is 
sufficiently large, and they qualitatively explain why considerably large finite size effects 
can be observed when the bias is relatively small. 

Promising future research includes an exploration of degree correlated models 
[T8| [38] as well as a refinement of the approximation schemes. 
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Appendix A. A proof of A < c 2 

The Perron-Frobenius theorem guarantees that the inequalities 
A = — max <v y JijViVj *> subj. to \v\^ = N 



N v 




subj. to \v\ = N 

subj. to \v\ 2 = N (A.l) 

hold for an arbitrary symmetric matrix VJ = (Jij)- Therefore, we only have to consider 
the cases in which all nonzero entries are unity. Given such a sample matrix J for which 
Npi nodes have degree c\ while the other Np2 nodes have degree c%, we shall add entries 
of unity, so as to make all nodes have degree c<i while keeping the matrix symmetric. 
We denote the resultant matrix J' = (</(,)• We also write the first eigenvector of J 
as V = (Vi), assuming a normalization of \V\ 2 = N. The Perron-Frobenius theorem 
ensures that VV^ is non-negative as well. This indicates that the inequality 

A = w £ ^ ^ E W 

subj. to \v\ 2 = N (A.2) 

holds since entries of V, J and J' are all non-negative and the number of nonzero 
entries of J' is larger than that of J. The last expression of (1A.2|) is maximized by 




v — (1,1, ... , 1) T for any realization of J', which yields N 1 J'ij v i v j = c 2- Therefore, 
A < C2 always holds for our ensemble of random matrices. 
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Appendix B. Replica approach to rinding the first eigenvalue 

Although we shall focus on the bimodal degree distribution for simplicity, extending the 
following calculation to general degree distributions is straightforward. To calculate 
the moment of the partition function ([3]), we first express the matrix entries as 
Jij = Jji = L(ij)B(ij), where (ij) denotes the unordered pair of i and j. is set 

to unity if there is a link for (ij), and it vanishes, otherwise, and Buj\ is a binary value 
sampled from (CEJ). Permutation symmetry in indexing the nodes allows us to choose a 
joint distribution of {L {ij) } £ {0, l}^- 1 )/ 2 , 

n Pi / 

PL ({L {ij) }) =M- 1 i[ 5 [T, L M 

i=i 

N Pl 




reflecting our assumptions on the graph generation. Here, M denotes a constant to 
normalize pi i = and we have utilized a contour integral expression 

5(x) = § dZ Z~( x+i y (2-Ki) for the inte ger x. The joint distribution of {B^} £ 
{+1, _i}iv(Ar-i)/2 is pB = Yl {ij)Pj (B {lj) \A) by definition. 

Next, we evaluate the average of Z n {J5\ J) with respect to these distributions by uti- 
lizing an identity Z n (/3; J) — J (Ua=i dv a 5(\v a \ 2 - A)) x exp ((/3/2) ^ =1 (v a ) T Jv a ) = 
I (FC=i dv a 8(\v a \ 2 - A)) x exp £™ =1 ^L {ij) B {ij) vfvp2^ . This identity is math- 

ematically valid only for n £ N. In this evaluation, the following expression appears: 

r t \ V" /fR nTTyE^iw) I P L (ij) B (ij) v i v j 

S(n)= }^ PB({B {ij )}) [[Z l ^ 'exp H 2^ 2 

{£ (tf) },{B w> } i=l V {ij) o=l 



l + Z&l Yl PAB {tl) \A)exp 



a=1 B {ij) =±1 



exp 



In J 1 + ZiZjexp 



n 



PBvfv? 



exp 2^ ^i^exp 2^ 



(ij) \a=l 



exp 



^ J du lQ ( Ul ) J du 2 Q(u 2 )ex V j , (B.2) 



where exp(/3 J BMiM 2 /2) = £) B=±1 A) exp(^Bu 1 u 2 /2), u k = (u\, u 2 k , . . . , v%) (k 
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1,2), and we have introduced an order parameter function, 

N n 



i=l a=l 

We shall also introduce a conjugate function Q(u) for utilizing an identity for Vit 
1 = f dQ(u)5 ( 1 jh Zt f[ 6(vt-u a ) - Q{u) J 

V i=l a=l / 



and employ another identity 

f3d\ a ( p\ 



* (l"T " *) = / ^exp - N . (B.5) 



47T 

\ \ i=l 

These, in conjunction with employment of the saddle point method for the integration 
with respect to Q{u), Q(u), and A a (a = 1,2, .. . , n), lead us to an expression for the 
average of Z n (0; J): 

lln[Z"(/?; J)]j = extr l^MG{n) - I duQ{u)Q{u) 
+ p 1 \Jj duexp (- £ j qc 1(u) 

+ p 2 \n^jdu exp ^- £ j {u) 

o=l J 

for n G N. 

In the calculation of C IB.6j) . we assume that the saddle point is dominated by 
functions of the form 

G(u) =T j dAdHq(A, H) 'exp (V - | U ,(B.7) 



and 



Q(u) = f J dAdHq(A, H) exp ^ {^{u a f - f3Hu a ^j j , (B.8) 

where T and T are normalization factors so as to make q(A, H) and q[A, H) distribution 
functions. We also assume that A a = A (a = 1, 2, . . . , n) holds at the dominant saddle 
point. These correspond to the replica symmetric ansatz [33] in the current system. 
The saddle point method gives N~ l In M = (c/2) ln(iVc) — p\ lncx! — P2 In c^.. Inserting 
these into (1B.6[) and extremizing the resultant expression with respect to T and T yields 

1 In [Z n (f3; J)]j = extr 1°- In (/Q [<?(•); n]) - cm (/C 2 [<?(•), ?(•); n]) 
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(B.9) 



where 



Kt [g(-);n] 



dArfH^A^Hi) J dA 2 dH 2 q(A 2 , H 2 ) 

A 2 Hf + 2JH 1 H 2 + A 1 Hi E\ H\ 



exp 

A^2 

^2 - 



n/2 



2{A X A 2 



2A 1 2A 2 



(B.10) 



IC2 [?(•), 5(0; "1 



dAdHq(A, H) J dAdHq(A, H) 
(H + H) 2 H 2 \ 



x exp n/3 



x 



3fe 



■),A;n] 



2(A - A) 2Ay j 

n/3 



n/2 



X 



Y[ dA^dH^q(A^ ) x exp 

n/2 

, * = 1,2) 



V 



B.ll) 

7 



(B.l 

Although we estimated [Z n ((3; J)} j for n G N with the saddle point 
method, the expressions flB.9j) - (1B.12j) are likely to hold for n G K as well. 
Therefore, we employ them to evaluate [A]j = 2 lim / 3_ i , 00 (/3A) _1 [In Z(f3; J)]j = 
2 lim^oo lim n _> o (0 /dn)((3N)~ l In [Z n {/3; J)]j, which yields 

Appendix C. Critical condition on emergence of complex solution for ( 1341) 

Let us consider a linear perturbation a — > a + i5a around a fixed point of ( 13"4"|) for a 
given A, which yields 

1 1 ffi A 1 



§-£ 2 Ua. (C.l) 



This means that a critical condition so that (1331) possesses a complex solution is provided 

as 



1 



1 



as 



Equation 



and 



(Sa + l) 2 V <9a 
indicates that 
OTj nci r 2 c 2 



da (A — c{a) 2 (A — c 2 a)' 



I -a 2 



(C.2) 

(C.3) 
(C.4) 



hold. Inserting these into (ICJ.2[) results in an expression equivalent to (I40p . 
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